Multilevel Refinement for Hierarchical Clustering

Authors

  • George Karypis
  • Vipin Kumar
Abstract

Hierarchical methods are a well-known clustering technique that can be very useful for various data mining tasks. A hierarchical clustering scheme produces a sequence of clusterings in which each clustering is nested into the next clustering in the sequence. Because hierarchical clustering is a greedy algorithm based on local search, the merging decisions made early in the agglomerative process are not necessarily the right ones. One possible solution to this problem is to refine a clustering produced by the agglomerative hierarchical algorithm so as to correct the mistakes made early in the agglomerative process. The problem of refining a clustering has many similarities with that of refining a min-cut k-way partitioning of a graph. In this paper, we explore multilevel refinement schemes for refining and improving the clusterings produced by hierarchical agglomerative clustering. The resulting algorithm combines traditional hierarchical clustering with multilevel refinement, which has been found to be very effective for computing min-cut k-way partitionings of graphs. We consider several clustering objective functions for the proposed refinement step and investigate their usefulness. Our experimental results demonstrate that this algorithm produces clustering solutions that are consistently and significantly better than those produced by hierarchical clustering algorithms alone. Furthermore, our algorithm has the additional advantage of being extremely fast, as it operates on a sparse similarity matrix. The time required by our algorithm ranged from two seconds for a data set with 358 items to 80 seconds for a data set with 9133 items on a Pentium II PC.
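The idea of correcting early greedy merges can be illustrated with a small sketch. The code below is not the authors' multilevel scheme: it is a flat, single-level illustration that runs a greedy average-link agglomeration and then a move-based refinement pass. The function names (agglomerate, objective, refine), the dense similarity matrix, and the objective (within-cluster similarity divided by cluster size) are all assumptions chosen for illustration; the abstract does not name the objective functions the paper studies.

```python
import numpy as np

def agglomerate(sim, k):
    """Greedy average-link agglomeration down to k clusters (illustrative)."""
    clusters = [[i] for i in range(len(sim))]
    while len(clusters) > k:
        best, pair = -np.inf, None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                s = sim[np.ix_(clusters[a], clusters[b])].mean()
                if s > best:
                    best, pair = s, (a, b)
        a, b = pair
        merged = clusters.pop(b)      # b > a, so index a is unaffected
        clusters[a].extend(merged)    # this merge is never undone
    labels = np.empty(len(sim), dtype=int)
    for c, members in enumerate(clusters):
        labels[members] = c
    return labels

def objective(sim, labels):
    """Assumed objective: sum over clusters of (internal similarity / size)."""
    total = 0.0
    for c in np.unique(labels):
        idx = np.flatnonzero(labels == c)
        total += sim[np.ix_(idx, idx)].sum() / len(idx)
    return total

def refine(sim, labels, max_passes=10):
    """Move single items between clusters while the objective improves."""
    labels = labels.copy()
    for _ in range(max_passes):
        moved = False
        for i in range(len(labels)):
            home = labels[i]
            if np.count_nonzero(labels == home) == 1:
                continue              # never empty a cluster
            best_c, best_val = home, objective(sim, labels)
            for c in np.unique(labels):
                if c == home:
                    continue
                labels[i] = c         # tentatively move item i
                val = objective(sim, labels)
                if val > best_val + 1e-12:
                    best_c, best_val = c, val
            labels[i] = best_c        # keep the best move, or restore home
            moved = moved or best_c != home
        if not moved:
            break
    return labels

# Toy data: two blocks of three items, similar within and dissimilar across.
sim = np.full((6, 6), 0.1)
sim[:3, :3] = 0.9
sim[3:, 3:] = 0.9
np.fill_diagonal(sim, 1.0)

start = np.array([0, 0, 1, 1, 1, 0])   # a deliberately poor 2-way clustering
refined = refine(sim, start)
print(objective(sim, start), objective(sim, refined))
print(agglomerate(sim, 2))
```

The refinement step recomputes the objective from scratch for every tentative move, which keeps the sketch short but is far slower than what a sparse, multilevel implementation would do.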

Similar resources

2 Review of Agglomerative Hierarchical Clustering Algorithms

Hierarchical methods are a well-known clustering technique that can be very useful for various data mining tasks. A hierarchical clustering scheme produces a sequence of clusterings in which each clustering is nested into the next clustering in the sequence. Because hierarchical clustering is a greedy algorithm based on local search, the merging decisions made early in the agglo...

Multilevel Approaches applied to the Capacitated Clustering Problem

This paper presents two multilevel refinement algorithms for the capacitated clustering problem. Multilevel refinement is a collaborative technique capable of significantly aiding the solution process for optimisation problems. The central methodologies of the technique are filtering solutions from the search space and reducing the level of problem detail to be considered at each level of the s...

Parallel Multilevel Tetrahedral Grid Refinement

In this paper we analyze a parallel version of a multilevel red/green local refinement algorithm for tetrahedral meshes. The refinement method is similar to the approaches used in the UG-package [33] and by Bey [11, 12]. We introduce a new data distribution format that is very suitable for the parallel multilevel refinement algorithm. This format is called an admissible hierarchical decompositi...

Single Assignment Capacitated Hierarchical Hub Set Covering Problem for Service Delivery Systems Over Multilevel Networks

The present study introduced a novel hierarchical hub set covering problem with capacity constraints. This study showed the significance of fixed charge costs for locating facilities, assigning hub links and designing a productivity network. The proposed model employs mixed integer programming to locate facilities and establish links between nodes according to the travel time between an origin-...

Artificial Multilevel Boundary Element Preconditioners

A hierarchical multilevel preconditioner is constructed for an efficient solution of a first kind boundary integral equation with the single layer potential operator discretized by a boundary element method. This technique is based on a hierarchical clustering of all boundary elements as used in fast boundary element methods. This hierarchy is applied to define a sequence of nested boundary ele...

Publication date: 1999